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Abstract We consider the deterministic evolution of a time-discretized spiking network of 
neurons with connection weights having delays, modeled as a discretized neural network of 
the generalized integrate and fire (gIF) type. The purpose is to study a class of algorithmic 
methods allowing to calculate the proper parameters to reproduce exactly a given spike train 
generated by an hidden (unknown) neural network. 

This standard problem is known as NP-hard when delays are to be calculated. We pro- 
pose here a reformulation, now expressed as a Linear-Programming (LP) problem, thus 
allowing to provide an efficient resolution. This allows us to "back-engineer" a neural net- 
work, i.e. to find out, given a set of initial conditions, which parameters (i.e., connection 
weights in this case), allow to simulate the network spike dynamics. 

More precisely we make explicit the fact that the back-engineering of a spike train, is a 
Linear (L) problem if the membrane potentials are observed and a LP problem if only spike 
times are observed, with a gIF model. Numerical robustness is discussed. We also explain 
how it is the use of a generalized IF neuron model instead of a leaky IF model that allows 
us to derive this algorithm. 

Furthermore, we point out how the L or LP adjustment mechanism is local to each unit 
and has the same structure as an "Hebbian" rule. A step further, this paradigm is easily gen- 
eralizable to the design of input-output spike train transformations. This means that we have 
a practical method to "program" a spiking network, i.e. find a set of parameters allowing us 
to exactly reproduce the network output, given an input. 

Numerical verifications and illustrations are provided. 
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1 Introduction 

Neuronal networks have tremendous computational capacity, but their biological complexity 
make the exact reproduction of all the mechanisms involved in these networks dynamics 
essentially impossible, even at the numerical simulation level, as soon as the number of 
neurons becomes too large. On crucial issue is thus to be able to reproduce the "output" of 
a neuronal network using approximated models easy to implement numerically. The issue 
addressed here is "Can we program an integrate and fire network, i.e. tune the parameters, 
in order to exactly reproduce another network output, on a bounded time horizon, given the 
input". 

This is the issue addressed here. 



Calculability power of neural network models 

The main aspect we are interesting here is the calculability of neural network models. It is 
known that recurrent neural networks with frequency rates are universal approximators l35l . 
as multilayer feed-forward networks are [23 1. This means that neural networks are able to 
simulate dynamical systems as an example see the very interesting paper of Albers-Sprott 
using this property to investigate the dynamical stability conjections of Pales and Smale in 
the field of dynamical systems theory [ 3 ] or route to chaos in high dimensional systems Q, 
not only to approximate measurable functions on a compact domain, as originally stated 
(see, e.g., ll35l for a detailed introduction on these notions). Spiking neuron networks have 
been proved to be also universal approximators (29). 

Theoretically, spiking neurons can perform very powerful computations with precise 
timed spikes. They are at least as computationally powerful, as the sigmoidal neurons tra- 
ditionally used in artificial neural networks [28 30]. This results has been shown using a 
spike-response model (see J32 1 for a review) and considering piece-wise linear approxi- 
mations of the potential profiles. In this context, analog inputs and outputs are encoded by 
temporal delays of spikes. The authors show that any feed-forward or recurrent (multi-layer) 
analog neuronal network (a-la Hopfield, e.g., McCulloch-Pitts) can be simulated arbitrarily 
closely by an insignificantly larger network of spiking neurons. This holds even in the pres- 
ence of noise [28,30]. These results highly motivate the use of spiking neural networks, as 
studied here. 

In a computational context, spiking neuron networks are mainly implemented through 
specific network architectures, such as Echo State Networks [25 1 and Liquid Sate Machines 
[31], that are called "reservoir computing" (see l45l for unification of reservoir computing 
methods at the experimental level). In this framework, the reservoir is a network model of 
neurons (it can be linear or sigmoid neurons, but more usually spiking neurons), with a ran- 
dom topology and a sparse connectivity. The reservoir is a recurrent network, with weights 
than can be either fixed or driven by an unsupervised learning mechanism. In the case of 
spiking neurons (e.g. in the model of ll33l ), the learning mechanism is a form of synaptic 
plasticity, usually STDP (Spike-Time-Dependent Plasticity), or a temporal Hebbian unsu- 
pervised learning rule, biologically inspired. The output layer of the network (the so-called 
"readout neurons") is driven by a supervised learning rule, generated from any type of clas- 
sifier or regressor, ranging from a least mean squares rule to sophisticated discriminant or 
regression algorithms. The ease of training and a guaranteed optimality guides the choice of 
the method. It appears that simple methods yield good results [45 1. This distinction between 
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a readout layer and an internal reservoir is indeed induced by the fact that only the output of 
the neuron network activity is constrained, whereas the internal state is not controlled. 

Learning the parameters of a neural network model 

In biological context, learning is mainly related to synaptic plasticity lEH TB l and STDP (see 
e.g., (40) for a recent formalization), as far as spiking neuron networks are concerned. This 
unsupervised learning mechanism is known to reduce the variability of neuron responses [7 ] 
and is related to the maximization of information transmission BP and mutual information 
(T5) . It has also other interesting computational properties such as tuning neurons to react 
as soon as possible to the earliest spikes, or segregate the network response in two classes 
depending on the input to be discriminated, and more general structuring such as emergence 
of orientation selectivity [22|. 

In the present study, the point of view is quite different: we consider supervised learning 
while, since "each spike may matter" [22, 18 1, we want not only to statistically reproduce 
the spiking output, but also to reproduce it exactly. 

The motivation to explore this track is twofold. On one hand we want to better under- 
stand what can be learned at a theoretical level by spiking neuron networks, tuning weights 
and delays. The key point is the non-learnability of spiking neurons (50), since it is proved 
that this problem is NP-complete, when considering the estimation of both weights and 
delays. Here we show that we can "elude" this caveat and propose an alternate efficient 
estimation, inspired by biological models. 

We also have to notice, that the same restriction apply not only to simulation but, as far 
as this model is biologically plausible, also holds at the biological level. It is thus an issue 
to wonder if, in biological neuron networks, delays are really estimated during learning 
processes, or if a weaker form of weight adaptation, as developed now, is considered. 

On the other hand, the computational use of spiking neuron networks in the framework 
of reservoir computing or beyond (36), at application levels, requires efficient tuning meth- 
ods not only in "average", but in the deterministic case. This is the reason why we must 
consider how to exactly generate a given spike train. 



What is the paper about 

In the next section we detail the proposed methods in three steps: first, discussing the neu- 
ral model considered here, provided underlying assumptions are assumed; then, detailing 
the family of estimation problems corresponding to what is called back-engineering and 
discussing the related computational properties; finally, making explicit how a general in- 
put/output mapping can be "compiled" on a spiking neural network thanks to the previous 
developments. 

In the subsequent section, numerical verifications and illustrations are provided, before 
the final discussion and conclusion. 



2 Problem position: Discretized integrate and fire neuron models. 

Let us consider a normalized and reduced "punctual conductance based generalized integrate 
and fire" (gIF) neural unit model (T9) as reviewed in (34) . The model is reduced in the sense 
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that both adaptive currents and non-linear ionic currents are no more explicitly depending on 
the potential membrane, but on time and previous spikes only (see lfT2ll for a development). 

Here we follow [8 , 13 , 12 1 after [37 1 and review how to properly discretize a gIF model. 
The precise derivation is not re-given here, except for one innovative aspect, whereas the 
method is only sketched out (see [13,12] for details). 

Time constrained continuous formulation. Let v be the normalized membrane potential, 
which triggers a spike for v = 1 and is reset at v — 0. The fire regime (spike emission) reads 
v(t) = 1 =>> v(t + ) = 0. Let us write u t = {• • • tf • • • }, the list of spike times tf < t. Here 
tf is the n-th spike-time of the neuron of index i. The dynamic of the integrate regime reads: 

j n 

Here, 77, and E\ are the membrane leak time-constant and reverse potential, while pj () 
and Ej are the spike responses and reverse potentials for excitatory /inhibitory synapses and 
gap-junctions. Furthermore, Pj(s) is the synaptic or gap-junction response, accounting for 
the connection delay and time constant; it is assumed that the response vanishes after a delay 
7>, where Pj(s) : having the following shape: 

Finally, i m () is the reduced membrane current, including simplified adaptive and non- 
linear ionic current (see lTT2ll for details). 

The dynamic of the integrate regime thus writes: 

— +g(t,ut)v = i(t,ut), 

so that knowing the membrane potential at time t, the membrane potential at time t + 6, 
writes: 

v(t + S) = v(t, t + 6, Cot) v(t) + Jl +5 v(s, t + 5, Cds) i(s, uj s ) ds, 
log(i/(t ,*l,£t )) = - fto g(s,u s )ds. 

The key point is that temporal constraints are to be taken into account ifTOl . Spike-times 
are bounded by a refractory period r, r < d™ +1 , defined up to some absolute precision St, 
while there is always a minimal delay dt for one spike to influence another spike, and there 
might be (depending on the model assumptions) a maximal inter- spike interval D such that 
either the neuron fires within a time delay < D or remains quiescent forever). For biological 
neurons, orders of magnitude are typically, in milliseconds: 



r 


St 


dt 


D 


1 


0.1 


10 -ll,2J 


10 L3,4j 



Network dynamics discrete approximation. Combining these assumptions and the previous 
equations allows one (see fill for technical details) to write the following discrete recur- 
rence equation for a sampling period 5: 

n D 

Vi[k] = 7 i Vi[k - 1] (1 - Zi[k - 1]) + Yl W vd Z jl k - d ] + 7 ^ (!) 

j=ld=l 
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where Vi[k] = Vi(kS) and Zi[k] — C[i,+oo[(^ M)» where £ is the indicatrix function, 
€a( x ) — 1 if x £ ^ and otherwise. 

Let us discuss in detail how (Q} is derived from the previous equations. 

The term (1 — Zi[k]) implements the reset mechanism, since this term is equal 
to when Vi[k] > 1. The interesting technical point is that this equation entirely 
specifies the integrate and fire mechanism. 

The = v(t, t + 5, &t)\ t=k s term takes into account the multiplicative effects 
of conductances. The numerical analysis performed in [ 1 3 1 demonstrates that, for 
numerical values taken from bio-physical models, considering here S ~ 0.1ms, this 
quantity related to the contraction rate, is remarkably constant, with small variations 
within the range: 

7i e [0.965, 0.995] ~ 0.98, 

considering random independent and identically distributed Gaussian weights. It has 
been numerically verified that taking this quantity constant over time and neurons 
does not significantly influence the dynamics. This the reason why we write ji as 
a constant here. This corresponds to a "current based" (instead of "conductance 
based") approximation of the connections dynamics. 
The additive current 



rt+8 / £^ \ / £^ 

Jt V r L J ,_ ux V T L 



t=kS 



t=kS 



accounts for membrane currents, including leak. The right-hand size approximation 
assume 7^ is constant. Furthemore, we have to assume that the additive currents 
are independent from the spikes. This means that we neglect the membrane current 
non-linearity and adaptation. 

On the contrary, the term related to the connection weights Wijd is not straight- 
forward to write and now requires to use the previous numerical approximation. Let 
us write: 

W i:j [k - h$] = Ej Jl +5 zy(s, t + 5, uj t ) Pj (t - ds\ 



E 3 5 llPj (t-^)\ u 



:k6,t?=k? 5 ' 



assuming z/(s, t + 5, Cot) — 7i as discussed previously. This allows us to consider the 
spike response effect at time t™ = k™ S as a function only of k — kj . The response 
Wij[d] vanishes after a delay D,r r — D5, as stated previously. We assume here 
that S < St i.e. that the spike-time precision allows to define the spike time as 
= k™ 5 (see 1113111211 for an extensive discussion). We further assume that 
only zero or one spike is fired by the neuron of index j, during a period 6, which is 
obvious as soon as S < r. 

This allows to write W^d = Wij [d] so that: 

E"=i Wn [k - kj] = E?=i E"=i w^ [d]n {k?} (k - d) 
= T.d=iWij\d]i [k] ...^ t ... } {k-d) 
= E2=iW ijd Zj[k-d] 

since Zj [I] — ... ^ n ,---} (0 i s precisely equal to 1 on spike time and otherwise, 
which completes the derivation of {j}. 
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Counting the model's degrees of freedom. Let us consider a network of N units, whose 
dynamics is defined by (Q]), generating a raster of the form schematized in Fig.Q] 

In order to determine the dynamics of the neural network, we require the knowledge of 
the initial condition. Here, due to the particular structure of equation (Q} with a delay D, the 
initial condition is the piece of trajectory Vi[k], k e {0, D{. The notation k e {0, D{ stands 
for < k < D.ln fact, it is sufficient to consider V$[0] and Zj[k],k G {0,D{, entirely 
defining V { [k],ke {0, D{ from ©. 

If the neuron i has fired at least once, the dependence in the initial condition is removed 
thanks to the reset mechanism. This means that its state does not depend on Vi [0] any more, 
as soon as spikes are known. We thus can further assume Vi [0] = 0, for the sake of simplicity. 

The initial state is thus defined by N numerical values and D x N binary values. 

The dynamics is parametrized by the weights thus N x N x D values. Here it is 
assumed that the 7^ are known and constant, while I ik are also known, as discussed in the 
sequel. 

When the potential and/or spikes are observed during a period of T samples, N x T 
numerical/binary values are measured. 



3 Methods: Weights and delayed weights estimation 

With the assumption that Vi[0] = discussed previously, {TJ writes: 

N D r zk 

y M = EE w ijd E ^ T z ji k -r-d\ + i ikr (2) 

j=l d=l r=0 

writing I ikr = 7 r h(k-r) with: 

r ik = k- arg min l>0 {Zi[l - 1] = 1}, 
the derivation of this last form being easily obtained by induction from (Q]). Here r ik is 
the delay from the last spiking time, i.e., the last membrane potential reset. If no spike, we 
simply set r ik = k. 

This equation shows that there is a direct explicit linear relation between spikes and 
membrane potential. See (SJ for more detailed about the one-to-one correspondence between 
the spike times and the membrane potential trajectory that seems defined by (Q]) and ®. 
Here, we use © in a different way. 

Let us now discuss how to retrieve the model parameters from the observation of the 
network activity. We propose different solutions depending on the paradigm assumptions. 



Retrieving weights and delayed weights from the observation of spikes and membrane po- 
tential 

Let us assume that we can observe both the spiking activity Zi [k] and the membrane poten- 
tial Vi [k] . Here, ([2]) writes in matrix form: 



Ci W ? ; di 



(3) 
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with: 




e R 
e R 



e R 



>T—DxN D 



,T-D 



<N D 



writing the transpose of u. 

Here, is defined by the neuron spike inputs, is defined by the neuron membrane 
potential outputs and membrane currents, and the network parameter by the weights vector 

More precisely, Q is a rectangular matrix with: 

- N D columns, corresponding to product with the N D unknowns Wijd, for j £ {1, N} 
andde {1,£>}, 

- T — D rows, corresponding to the T — D measures Vi[k], for k G {D, T}, and, 

- T — D x N D coefficients corresponding to the raster, i.e. the spikes Zj [k]. 

The weights are thus directly defined by a set of linear equalities for each neuron. Let us 
call this a Linear (L) problem. 

Furthermore, the equation defined in ([5]), concerns only the weights of one neuron of 
index i. It is thus a weight estimation local to a neuron, and not global to the network. 
Furthermore, the weight estimation is given by the observation of the input Z^ [k] and output 
Vi [k] . These two characteristics correspond to usual Hebbian-like learning rules architecture. 
See ifTTll for a discussion. 

Given a general raster (i.e., assuming Q is of full rank min(T — D,N D)): 

- This linear system of equations has always solutions, in the general case, if: 



This requires enough non-redundant neurons N or weight profile delays D, with respect 
to the observation time T. In this case, given any membrane potential and spikes values, 
there are always weights able to map the spikes input onto the desired potential output. 

- On the other hand, if N D < T — D, then the system has no solution in the general 
case. This is due to the fact that we have a system with more equations than unknowns, 
thus with no solution in the general case. However, there is obviously a solution if the 
potentials and spikes have been generated by a neural network model of the form of {]]). 

If Ci is not of full rank, this may correspond to several cases, e.g.: 

- Redundant spike pattern: some neurons do not provide linearly independent spike trains. 

- Redundant or trivial spike train: for instance with a lot of bursts (with many Zj [k] = 1) 
or a very sparse train (with many Zj [k] = 0). Or periodic spike trains. 

Regarding the observation duration T, it has been demonstrated in |[8l[T3l that the dy- 
namic of an integrate and fire neural network is genericall>Q periodic. This however depends 
on parameters such as external current or synaptic weights, while periods can be larger than 
any accessible computational time. 

1 Considering a basic leaky integrate and fire neuron network the result is true except for a negligible set 
of parameters. Considering an integrate and fire neuron model with conductance synapses the result is true, 
providing synaptic responses have a finite memory. 




(4) 
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In any case, several choices of weights (in the general case a.D (N +1) —T dimen- 
sional affine space) may lead to the same membrane potential and spikes. The problem of 
retrieving weights from the observation of spikes and membrane potential may thus have 
many solutions. 

The particular case where D = 1 i.e. where there is no delayed weights but a simple 
weight scalar value to define a connection strengths is included in this framework. 



Retrieving weights and delayed weights from the observation of spikes 

Let us now assume that we can observe the spiking activity Z{ [k] only (and not the mem- 
brane potentials) which corresponds to the usual assumption, when observing a spiking 
neural network. 

In this case, the value of Vi [k] is not known, whereas only its position with respect to 
the firing threshold is provided: 

Zi[k] = Vi[k] < 1 and Z^k] = 1 Vi[k] > 1, 
which is equivalent to write the condition: 



e ik = (2Zi[k]-l) (V z [k}-1)>0. 

the case Vi[k] = 1 being excluded here. Note that the case Vi[k] = 1 corresponds to a 
singularity in the dynamics leading to effects such as sensibility to perturbations. In this 
section, we exclude this case setting e ik > (at the numerical level, this yields e ik > e, for 
some e > 0). 

If the previous condition e ik > is verified for all time index k and all neuron index 
i, then the spiking activity of the network exactly corresponds to the desired firing pattern. 
However, in this case, the membrane potential value may differ from one setting to another. 

Expanding (0, with the previous condition allows us to write, in matrix form: 



+ ^ > (5) 



writing: 



A, 


■t 


.. (2Zi[k] 


b t 


= (■■■ 


(2Zi[k\ 


w, 


= (... 


w ijd 






{2Zi[k] 



T-DxND 



-i) J2 T T J =o-f T Zj[k-T-d} ... eR 

-l)(/ ifcT -l) ...) f €R T ~ D , 
...) f eR ND , 
-l)(Vi[fc]-l) ...) f €R T ~ D , 

thus Ai — T>i Ci where is the non-singular j{ T - DxT ~ D diagonal matrix with T>^ k — 

2Zi[k] - l e {-1,1}. 

The weights are now thus directly defined by a set of linear inequalities for each neuron. 
This is thus a Linear Programming (LP) problem. See ifTTl for an introduction and (6) for 
the detailed method used here to implement the LP problem. 

Furthermore, the same discussion about the dimension of the set of solutions applies to 
this new paradigm except that we now have to consider a simplex of solution, instead of a 
simple affine sub- space. 
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A step further, < e ik is the "membrane potential distance to the threshold". Constrain- 
ing the e^fc is equivalent to constraining the membrane potential value Vi [k] . 
It has been shown in [8 | how: 

|e|oo = min inf (6) 
i k>0 

can be interpreted as a "edge of chaos" distance, the smallest |e| the higher the dynamics 
complexity, and the orbits periods. 

On the other hand, the higher e^, the more robust the estimation. If e ik is high, sub- 
threshold and sup-threshold values are clearly distinct. This means that numerical errors are 
not going to generate spurious spikes or cancel expected spikes. 

Furthermore, the higher |e|oo the smaller the orbits period [8|. As a consequence, the 
generated network is expected to have rather minimal orbit periods. 

In the sequel, in order to be able to use an efficient numerical implementation, we are 
going to consider a weaker but more robust norm, than |e|oo: 

\ e i\l=J2 e ik (7) 

k 

We are thus going to maximize, for each neuron, the sum, thus, up to a scale factor, the 
average value of e ik . 

Let us now derive a bound for e ik . Since < Vi[k] < 1 for sub-threshold values and 
reset as soon as Vi [k] > 1, it is easily bounded by: 

vr n = £ w ijd <vM<vr ax = E w a* 

jd,W ijd <0 jd,W ijd >0 

and we must have at least v™ ax > l in order for a spike to be fired while yJ n%n < by 
construction. These bounds are attained in the high-activity mode when either all excitatory 
or all inhibitory neurons fire. From this derivation, e max > and we easily obtain: 

n . ^ max /-I t 7-min vrmax -i\ 

< e ik < e = max(l - Vi , VJ - 1) 

i 

thus an explicit bound for e ik . 

Collecting all elements of the previous discussion, the present estimation problem writes: 

max e/e, with, < e ik < e max , and, = AjWj + (8) 

k 

which is a standard bounded linear-programming problem. 

The key point is that a LP problem can be solved in polynomial time, thus is not a 
NP-complete problem, subject to the curse of combinatorial complexity. In practice, this LP 
problem can be solved using the Simplex method (which is, in principle, NP-complete in 
the worst case, but) in practice, as fast as, when not faster, than polynomial methods. 
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Retrieving signed and delayed weights from the observation of spikes 

In order to illustrate how the present method is easy to adapt to miscellaneous paradigms, 
let us now consider the fact that the weight emitted by each neuron have a fixed sign, either 
positive for excitatory neurons, or negative for inhibitory neurons. This additional constraint, 
known as the "Dale principle" f38), is usually introduced to take into account the fact, that 
synaptic weights signs are fixed by the excitatory or inhibitory property of the presynaptic 
neuron. 

Although we do not focus on the biology here, it is interesting to notice that this addi- 
tional constraint is obvious to introduce in the present framework, writing: 
W ijd = S ijd W' d , with S ijd e {-1, 1}, and W* jd > 
thus separating the weight sign Sij d which is a-priory given and the weight value W*j d 
which now always positive. 

Then, writing: 

A ijkd — Aijk d S^jd 
the previous estimation problem becomes: 

max e k , with, < e ik < e max ,0 < W* jd < 1, and, e* = A* w* + (9) 

k 

which is still a similar standard linear-programming problem. 



Retrieving delayed weights and external currents from the observation of spikes 

In the previous derivations, we have considered the membrane currents I ik are inputs, i.e. 
are known in the estimation. Let us briefly discuss the case where they are to be estimated 
too. 

For adjustable non- stationary current I ik , the estimation problem becomes trivial. An 
obvious solution is W^ d — 0, I ik = 1 + a (Zi[k] — 1/2) for any a > 0, since each current 
value can directly drive the occurrence or inhibition of a spike, without any need of the 
network dynamics. 

Too many degrees of freedom make the problem uninteresting: adjusting the non- stationary 
currents leads to a trivial problem. 

To a smaller extends, considering adjustable stationary currents Ii also "eases" the es- 
timation problem, providing more adjustment variables. It is obvious to estimate not only 
weights, but also the external currents, since the reader can easily notice that yet another 
linear-programming problem can be derived. 

This is the reason why we do not further address the problem here, and prefer to explore 
in details a more constrained estimation problem. 



Considering non-constant leak when retrieving parametrized delayed weights 

For the sake of simplicity and because this corresponds to numerical observations, we have 
assumed here that the neural leak 7 is constant. The proposed method still works if the leak 
varies with the neuron and with time i.e. is of the form 7^, since this is simply yet another 
input to the problem. The only difference is that, in (0) and the related equations, the term 
7 r is to be replaced by products of 7^. 
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However, if 7 is a function of the neural dynamics, thus of the previous method 

must be embedded in a non linear estimation loop. Since we know from [13 ] that this de- 
pendency is numerically negligible in this context, we can propose the following loop: 

1. Fix at step t — 0, 7^, to initial values. 

2. k- Estimate the weights given leaks 7^. at k = 0, 1, ... 

3. k- Re-simulate the dynamics, given the weights and to obtain corrected values 7^. 

4. k- Smoothly modify 7^ +1 = (1 - v) jf t +1 + v 7^ 

5. k + 1 Repeat step 2,k- for k + 1 the convergence of this non-linear relaxation method 
being guarantied for sufficiently small v. See l48l for an extended discussion about this 
class of methods. 

This shows that considering models with leaks depending on the dynamics itself is no 
more a LP-problem, but an iterative solving of LP-problems. 



Retrieving parametrized delayed weights from the observation of spikes 

In order to further show the interest of the proposed method, let us now consider that the 
profile of the weights is fixed, i.e. that 

W ijd = W° i3 a T (d) with, e.g., a(d) = | e"* 
thus the weights is now only parametrized by a magnitude W°j , while the temporal profile 
is known. 

Here a T (d) is a predefined synaptic profile, while r is fixed by biology (e.g., r = 2 ms 
for excitatory connections and r = 10ms for inhibitory ones). Let us note that the adjust- 
ment of r would have been a much more complex problem, as discussed previously in the 
non-parametric case. 

This new estimation is defined by: 

e* = A?w?+bi >0 (10) 

writing: 

(2 Zi[k] - 1) Ed E^o l T Zj [k-r-d] a(d) . . . j G R T ~ D * N 

Wij ...) f eR N 

thus a variant of the previously discussed mechanisms. 

This illustrates the nice versatility of the method. Several other variants or combinations 
could be discussed (e.g. parametrized delayed weights from the observation of spikes and 
potential, ..), but they finally leads to the same estimations. 




About retrieving delays from the observation of spikes 

Let us now discuss the key idea of the paper. 

In the previous derivations, we have considered delayed weights, i.e. a quantitative 
weight value W^d at each delay d G {1, D}. 

Another point of view is to consider a network with adjustable synaptic delays. Such 
estimation problem may, e.g., correspond to the "simpler" model: 
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n 

Vi[k] = 7< Vi[k - 1] (1 - Z^fc - 1]) + J2 w ij z jl k ~ d ij] + Uk, 

3=1 

where now the weights Wij and delays are to estimated. 

As pointed out previously, the non-learnability of spiking neurons is known il5Ql . i.e. 
the previous estimation is proved to be NP-complete. We have carefully checked in j[5Ql 
that the result still apply to the present setup. This means that in order to "learn" the proper 
parameters we have to "try all possible combinations of delays". This is intuitively due to 
the fact that each delay has no "smooth" effect on the dynamics but may change the whole 
dynamics in a unpredictable way. 

We see here that the estimation problem of delays seems not compatible with usable 
algorithms, as reviewed in the introduction. 

We propose to elude this NP-complete problem by considering another estimation prob- 
lem. Here we do not estimate one delay (for each synapse) but consider connection weights 
at several delay and then estimate a weighted pondering of their relative contribution. This 
means that we consider a weak delay estimation problem. 

Obviously, the case where there is a weight Wij with a corresponding delay d^ e 
{0, D} is a particular case of considering several delayed weights Wijd (corresponding to 
have all equal weights to zero except at dij, i.e., W^d = if d = dij then Wij else 0). In 
other words, with our weaker model, we are still able to estimate a neural network with 
adjustable synaptic delays. 

We thus do not restrain the neural network model by changing the position of the prob- 
lem, but enlarge it. In fact, the present estimation provides a smooth approximation of the 
previous NP-complete problem. 

We can easily conjecture that the same restriction also apply of the case where the ob- 
servation of spikes and membrane potential is considered. 

We also have to notice, that the same restriction apply not only to simulation but, as 
far as this model is biologically plausible, also true at the biological level. It is thus an 
issue to wonder if, in biological neural network, delays are really estimated during learning 
processes, or if a weaker form of weight adaptation, as discussed in this paper, is considered. 



4 Methods: Exact spike train simulation 

4.1 Introducing hidden units to match any raster 
Position of the problem 

Up to now, we have assumed that a raster Zi[k],i £ {1, N}, k e {1, T} is to be generated by 
a network whose dynamics is defined by Q, with initial conditions Zj [k],j e {1, iV}, k <G 
{1, D} and Vj [0] = 0. In the case where a solution exists, we have discussed how to compute 
it. 

We have seen a solution always exists, in the general case, if the observation period is 
small enough, i.e., T < 0(N D). Let us now consider the case where T >> 0(N D). 

In this case, there is, in general, no solution. This is especially the case when the raster 
has not been generated by a network given by Q, e.g., in the case when the raster is random. 

What can we do then ? For instance, in the case when the raster is entirely random and 
is not generated by a network of type (QJ ? 
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The key idea, borrowed from the reservoir computing paradigm reviewed in the intro- 
duction, is to add a reservoir of "hidden neurons", i.e., to consider not N but N + S neu- 
rons. The set of N "output" neurons is going to reproduce the expected raster Zi [k] and the 
set of S "hidden" neurons to increase the number of degree of freedom in order to obtain 
T < 0((N + S) D), thus being able to apply the previous algorithms to estimate the optimal 
delayed weights. Clearly, in the worst case, it seems that we have to add about S — 0(T / D) 
hidden neurons. This is illustrated in Fig- EI 

In order to make this idea clear, let us consider a trivial example. 

Sparse trivial reservoir 

Let us consider, as illustrated in Fig. [3] S — T/D + 1 hidden neurons of index % £ {0, S} 
each neuron firing once at ty — % D, except the last once always firing (in order to maintain 
a spiking activity), thus: 

Z v [k] = S(i D - k), < i < S, Z s [k] = 1 

where S(k) = £{o,o}(^) i s me Kronecker symbol, as shown in Fig. [3] 
Let us choose: 

Wssi > 1 

W-fQl 

VV l'Sl 't.,-1/2 

7 (1-7 %') 

Wi>j'd — otherwise 

with initial conditions [k] = 0, i' G {0, S{ and Z s [k] = 1, k G {1, D}, while l i>k = 0. 

A straight-forward derivation over equation (Q]) allows to verify that this choice allows 
to generate the specified Zy [k]. In words, as the reader can easily verify, it appears that: 

- the neuron of index S is always firing since (though Wssi) a positive internal loop 
maintains its activity; 

- the neurons of index % £ {0, S{, whose equation writes: 

Vi> [k] = 7 Vi> [k ~ 1] (1 - Zi> [k - 1]) + Wysi + W Vilx Z v [k - 1] 

is firing at ty integrating the constant input Wysi\ 

- the neurons of index % G {0, S{, after firing is inhibited (though Wyyi) by a negative 
internal loop, thus reset at value negative low enough not to fire anymore before T. We 
thus generate Zy [k] as expected. 

Alternatively, the use of the firing neuron of index S can be avoided by introducing a 
constant current Iy^ = Wysi- 

However, without the firing neuron of index S or some input current, the sparse trivial 
raster can not be generated, although T < 0(N D). This comes from the fact that the 
activity is too sparse to be self-maintained. This illustrates that when stating that "a solution 
exists, in the general case, if the observation period is small enough, i.e., T < 0(N D)", a 
set of singular cases, such as this one, was to be excluded. 

The hidden neurons reservoir raster being generated, it is straight-forward to generate 
the output neuron raster, considering: 

- no recurrent connection between the N output neurons, i.e., — 0^ £ £ 

{l,N},de{l,D}, 
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- no backward connection from the N output neurons to the S hidden neurons i.e., W^jd — 

o,i' e{o,N{,j e{i,N},de{i,D}, 

- but forward excitatory connections between hidden and output neurons: 

Wij'd = (1 + e) Zi [f D + d] for some small e > 

yielding, from {TJ : 

Vi[k] = £?, =1 Ed=i w^'d z jf [k - d] 

= EjLi E?=i(l + <0 - 4 - (* - d)) 

= (1 + e) 

setting 7 = for the output neuron and 1^ = 0, so that Z^[fc] = i.e., the generated 

spikes Zi[k] correspond to the desired Zi[k], as expected. 

The linear structure of a network raster 

The previous construction allows us to state: given any raster ofN neurons and observation 
time T, there is always a network of size N + T/D + 1 with weights delayed up to D, which 
exactly simulates this raster. What do we learn from this fact ? 

This helps to better understand how the reservoir computing paradigm works: Although 
it is not always possible to simulate any raster plot using a ''simple" integrate and fire 
model such as the one defined in ([7), adding hidden neurons allows to embed the problem 
in a higher-dimensional space where a solution can be found. 

This results is induced by the fact that we have made explicit, in the previous section, 
that learning the network weights is essentially a linear (L or LP) problem. With this inter- 
pretation, a neuron spiking sequence is a vector in this linear space, while a network raster is 
a vector set. Designing a "reservoir" simply means choosing a set of neurons which spiking 
activity spans the space of expected raster. We are going to see in the next section that this 
point of view still holds in our framework when considering network inputs. 

This linear- algebra interpretation further explains our "trivial sparse" choice: We have 
simply chosen a somehow canonical orthonormal basis of the raster linear space. One con- 
sequence of this view is that, given a network raster, any other network raster which is a 
linear combination of this raster can be generated by the same network, by a simple change 
of weights. This is due to the fact that a set of neurons defining a given raster corresponds 
to the set of vectors spanning the linear space of all possible raster generated by this net- 
work. Generating another raster corresponds to a simple change of generating vectors in the 
spanning set. This also allows us to define, for a given raster linear space, a minimal set of 
generating neurons, i.e. a vectorial basis. The "redundant" neurons are those which spiking 
sequence is obtained by feed-forward connections from other neurons. 

We must however take care of the fact the numerical values of the vector are binary 
values, not real numbers. This is a linear space over a finite field, whereas its scalar product 
is over the real numbers. 

On optimal hidden neuron layer design 

In the previous paragraph, we have fixed the hidden neuron spiking activity, choosing a 
sparse ad-hoc activity. It is clearly not the only one solution, very likely not the best one. 

We thus may consider the following problem: given N output neurons and S hidden neu- 
rons what are the "best" weights W and hidden neuron activity Zy [k] allowing to reproduce 
the output raster. 
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By "best", we mean optimal in a precise sense defined in the previous section. In that 
case, instead of having to solve a LP-problem, as specified in ®, we have to consider a 
much more complicated problem now: 

- not a linear but bi-linear problem (since we have to consider the products of weights to 
estimate with spike activity to estimate, as readable in (Q}; 

- not a standard linear programming problem with real values to estimate, but a mixed 
integer programming problem with both integer values to estimate. 

This has a dramatic consequence, since such problem is known as being NP-hard, thus 
not solvable in practice, as discussed previously for the estimating of delays. 

This means that we can not consider this very general question, but must propose heuris- 
tics in order to choose or constraint the hidden neuron activity, and then estimate the weights, 
given the output and hidden neuron's spikes, in order to still consider a LP-problem. 

Let us consider one of such heuristic. 



A maximal entropy heuristic 

Since we now understand that hidden neuron activity must be chosen in order to span as 
much as possible the expected raster space, and since we have no a-priory information about 
the kind of raster we want to generate (we target here a "general" algorithm), the natural idea 
is to randomly choose the neuron activity with a maximal randomness. 

Although it is used here at a very simple level, this idea is profound and is related to 
random sampling and sparse approximation of complex signal in noise (see [43,44] for a 
didactic introduction), leading to greedy algorithms and convex relaxation [42,24]. Since 
inspired by these elaborated ideas, the proposed method is simple enough to be described 
without any reference to such formalisms. 

In this context, maximizing the chance to consider a hidden neuron with a spiking ac- 
tivity independent from the others, and which adds new independent potential information, 
simply corresponds to choose the activity "as random as possible". This corresponds to a 
so called Bernouilli process, i.e., simply to randomly choose each spike state independently 
with equi-probability. 

Since we want to simulate the expected raster with a minimal number of hidden neuron, 
we may consider the following algorithmic scheme: 

1. Starts with no hidden but only output neurons. 

2. Attempts to solve ^ on hidden (if any) and output neurons, in order to obtain weights 
which allows the reproduction of the expected raster on the output neurons. 

3. If the estimation fails, add a new hidden neuron and randomly draw its spiking activity 

4. Repeat step 2 and 3 until an exact reproduction of the expected raster is obtained 

Clearly, adding more and more random points to the family of generating elements must 
generate a spanning family after a finite time, since randomly choosing point in an affine 
space, there is no chance to always stay in a given affine sub-space. This means that we 
generate a spanning family of neuron after a finite time, with a probability of one. So that 
the algorithm converges. 

What is to be numerically experimented is the fact we likely obtain a somehow minimal 
set of hidden neurons or not. This is going to be experimented in section [51 
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4.2 Application: Input/Output transfer identification 

Let us now describe the main practical application of the previous algorithmic development, 
which is to "program" a spiking network in order to generate a given spike train or realize a 
given input/output spike train function. 

In the present context, this means finding the "right" or the "best" spiking network pa- 
rameters in order to map an input's set onto an output's set. 

What is pointed out here, is the fact that the previous formalism does not only apply to 
the simulation of a unique, input less, fully connected network, but is applicable to a much 
wider set of problems. 

In order to make this explicit, let us consider the following specification of spiking neural 
networks with units defined by the recurrent equation ([TJ. 

connectivity We now do not assume a fully connected network but consider a connection 
graph AC, i.e., some connections weights are zero. This writes: 

ViJ £1CC{0,N{ 2 ,Vd,W ijd = 

or in matrix form: 

Kw = (11) 

for a diagonal matrix K with ijd = if i, j £ AC and 1 otherwise, 
input current We consider that any neurons can be driven by an input current J^, thus defin- 
ing an "analog" input. 

input spikes We have also to consider that the network can also be driven by external in- 
coming spikes. In order to implement this feature in the present framework, we use the 
following trick. 

For each spike train input, we add an "input spiking neuron", thus with its state corre- 
sponding to the incoming spike train. This is obviously implementable in the previous 
network by considering Vi[k] — 1^, with I ik being a binary value I ik e {0,1 + e} 
for some small e > 0. This corresponds to neuron defined by (Q]) but with 7^ = and 
Vj, d, Wijd — 0, thus a trivial connectivity as introduced previously. 
This tricks allows to reduce the spike input specification to a current input specification, 
output neurons We consider that a subset of the neurons are output neurons, thus with a 
state which is readout and thus must be constrained, as defined in ([5]). Other neurons are 
hidden neurons as discussed in the previous section. 

As discussed previously, the best heuristics at this level of knowledge is to randomly 
generate the hidden neuron required activity, 
weighted estimation We further consider that depending on neuron and time the estima- 
tion requirement is not homogeneous, whereas there are times and neurons for which 
the importance of potential to threshold distance estimation differs from others. This 
generalized estimation is obvious to introduce in our formalism, defining: 

\*i\i,A = ^2Ak eikiAk > (12) 

k 

for some metric A. 

We further consider a "supervised learning paradigm" is the following sense. We now 
consider a family of L input current or spikes vectors: 

!' = (... 4 ...) t €E iVxT - D 
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to be mapped on family of output spike trains: 

Z' = (... Z t [k] 1 ..^€R NxT - D , 

given initial states: 

for / G {0, L{ and would like to find the right weights w allowing to perform this in- 
put/output mapping. The estimation problem is in fact strictly equivalent to ^ now con- 
catenating the input information (except the initial states), thus writing: 

A i =^.'.'.(2Z i [k] l -l) J2 T =r jk ^Zj[k-r-d\ 1 '.'.'^ eR L(T-D),N Dj 

b i = (... (2Z i [k] l -l)(I l ikr -l) €R L ^ T ~ D \ 

This formalism, thus now allows us to find an exact input/output mapping, adding hidden 
neurons in order to have enough degree of freedom to obtain a solution. 



4.3 Application: approximate Input/Output transfer identification 

Let us finally discuss how to apply the previous formalism to approximate transfer identifi- 
cation. We consider, in our context, deterministic alignment metrics defined on spike times 

Using alignment metric. In this context, the distance between two finite spike trains T, T' 
is defined in terms of the minimum cost of transforming one spike train into another. Two 
kinds of operations are defined: 

- spike insertion or spike deletion, the cost of each operation being set to 1 

- spike shift, the cost to shift from tf e T to t - m e T 1 being set to \t f - t i rn \/r for a time 
constant r. 

Although computing such a distance seems subject to a combinatorial complexity, it 
appears that quadratic algorithms are available, with the same complexity in the discrete 
and continuous case. 

For small r, the distance approaches the number of non-coincident spikes, since instead 
of shifting spikes it is cheaper to insert/delete non-coincident spikes, the distance being 
always bounded by the number of spikes in both trains. 

For high r, the distance basically equals the difference in spike number (rate distance), 
while for two spike trains with the same number of spikes, there is always a time-constant r 
large enough for the distance to be equal to J2 n ~ ^ n |/ r - 

It appears that the LP algorithms initial phase [ IV , 6 1 which attempts to find an initial 
solution before optimizing it, generates a divergence between the obtained and expected 
raster, this divergence being zero, if a solution exists. Furthermore, this divergence can be 
related to the present alignment metric, for high r, and on-going work out of the scope of 
this subject develops this complementary aspect. 

When considering spike trains with more than one unit, an approach consists to sum 
the distances for each alignment unit-to-unit. Another point of view is to consider that a 
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spike can "jump", with some cost, from one unit in T to another unit in T 1 . The related 
algorithmic complexity is no more quadratic but to the power of the number of units (4j. 

This family of metrics include aligments not only on spike times, but also on inter-spike 
intervals, or metrics which are sensitive to patterns of spikes, etc... They have been fruitfully 
applied to a variety of neural systems, in order to characterize neuronal variability and cod- 
ing Eoll . For instance, in a set of neurons, that act as coincidence detectors, with integration 
time (or temporal resolution) r, spike trains will have similar postsynaptic effects if they are 
similar w.r.t. this metric. Furthermore, this metric generalizes to metric with causality (at 
a given time, the cost of previous spikes alignment decreases with the obsolescence of the 
spike) and non-linear shift's costs [ 10 1. 



Application to approximate identification. Our proposal is to re-visit the maximal entropy 
heuristic algorithm defined previously and consider having a correct identification, if the 
distance between the expected and obtained raster is not zero (equality) but below a thresh- 
old. 

This allows to not only address: 

- the exact estimation problem, i.e. find an exact input/output mapping if and only if there 
is one, but also 

- the approximate estimation problem, i.e. to find an approximate input/output mapping 
that minimizes a given distance. 

This however, is a trivial strategy because the alignment distance is not used to find the 
optimal solution, but only to check wether this solution is acceptable. The reason of such a 
choice is easy to explain: alignment metrics, as it, are highly non-differenciable with respect 
to the network parameters. Therefore variational methods, considering, e.g., the criterion 
gradient in order to drive the parameters value towards a local optimum do not apply here. 

Several alternatives exist. One considers the spike time defined by the equation Vw(tf) = 
0, where V is the membrane potential, is the firing threshold, and W are the network pa- 
rameters to adjust ll36l . From the implicit function theorem it is obvious to relate locally 
dW and dtf, thus derive a parameter gradient with respect to the spike times. However, 
such method is numerically ill-defined, since the threshold value is not significant, but 
only a modeling trick. 

Another alternative is to consider convolution metrics |[TTL in order to relate the spike 
train to a differentiable signal, thus in a context where variational methods can be applied. 
One instantiation of this idea considers an abstract specification of the input/output methods, 
using piece-wise linear SRM models [21 1. This algebraic simplification allows to implement 
variational methods [49 27] in order to specify the network input/output relations. Such 
methods, however, are restrained to a given neuronal model and to a given coding (temporal 
coding in this case) of a continuous information using spike times. Such methods must thus 
be generalized in order to be applied in the present context. 

When the present method fails, it still provides an approximate solution with a "max- 
imal" number of correct spikes, by the virtue of the {8} minimization mechanism. Each 
"false" state corresponds to < (i.e. either a spurious spike or a missing spike), and it 
is an easy exercise to relate this to a particular alignment metric. We conjecture that this is a 
interesting track in order to generalize the present work, from exact estimation, to exact or 
approximate estimation. 
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5 Results 

5.1 Retrieving weights from the observation of spikes and membrane potential 

In a first experiment, we consider the linear problem defined in ([3]) and use the singular 
value decomposition (SVD) mechanism [ 20 ] in order to obtain a solution in the least-square 
sense. Here the well-established GS 10 library SVD implementation is used. 
This allows us to find: 

- if one or more solution, the weights of minimal magnitude |w^ | 2 = J2jd ^ijd'^ 

- if no exact solution, the solution which minimizes Y2^(Vi[k] — Vi[k]) 2 where Vi[k] is 
the membrane potential predicted by the estimation. 

The master and servant paradigm. 

We have seen that, if D (N + 1) > T, i.e., if the observation time is small enough for any 
raster there is a solution. Otherwise, there is a solution if the raster is generated by a model 
of the form of (Q]). We follow this track here and consider a master/servant paradigm, as 
follows: 

1. In a first step we randomly choose weights and generate a "master" raster. 

2. The corresponding output raster is submitted to our estimation method (the "servant"), 
while the master weights are hidden. The weights are taken from a normal distribution 

2 

A/"(0, j^) with 70% excitatory connections and 30% for inhibitory one. The standard 
deviation a £ [1, 10] has been chosen in order to obtain an interesting dynamics, as 
discussed in (8). 

The algorithm defined in © or in ([5]) thus receives a set of spikes for which we are sure that 
a solution exists. Therefore it can be used and leads to a solution with a raster which must 
exactly correspond to the master input raster. 

Note that this does not mean that the servant is going to generate the raster with the same 
set of weights, since several solutions likely exist in the general case. Moreover, except for 
the paradigm ©,the master and servant potential Vi [k] are expected to be different, since 
we attempt to find potentials which distance to the threshold is maximal, in order to increase 
the numerical robustness of the method. 

This is going to be the validation test of our method. As an illustration we show two 
results in Fig.|4]and Fig.[5]for two different dynamics. The former is "chaotic" in the sense 
that the period is higher than the observation time. 

In the non trivial case in Fig. [4] it is expected that only one weight's set can generate 
such a non-trivial raster, since, as discussed before, we are in the "full rank" case, thus with 
a unique solution. We observe the same weights for both master and servant in this case, 
as expected. This would not the case for simpler periodic raster, e.g. in Fig. [5] where the 
weight's estimation by the servant differs from the master's weights, since several solutions 
are possible. 

Retrieving weights from the observation of spikes and membrane potential has been 
introduced here in order to explain and validate the general method in a easy to explain 
case. Let us now turn to the more interesting cases where only the observation of spikes are 
available. 

2 http : //www. gnu . org/sof tware/gsl 
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5.2 Retrieving weights from the observation of spikes 

In this setup we still use the master / servant paradigm, but now consider the LP problem 
defined previously. The numerical solutions are derived thanks to the well-established im- 
proved simplex method as implemented in GLPP0. 

As an illustration we show two results in Fig. [6] and Fig. [7] for two different dynamics. 
Interesting is the fact that, numerically, the estimated weights correspond to a parsimonious 
dynamics in the sense that the servant raster period tends to be minimal: 

- if the master raster appears periodic, the servant raster is, with same period; 

- if the master raster appears aperiodic (i.e., "chaotic") during the observation interval, the 
servant raster is periodic with a period close (but not always identical) to the observation 
time T. This is coherent with the theoretical analysis of such networks |[8][T3l, ans is 
futher investigated in the sequel. 



5.3 Retrieving delayed weights from the observation of spikes 

In this next setup we still consider the same master/servant paradigm, for N = 50 units, 
with a leak 7 = 0.95 and an external current I = 0.3, but in this case the master delayed 
weight profile has the standard form shown in Fig. [8] 

In the case of trivial dynamics, it is observed that the estimated servant weight distri- 
bution is sparse as illustrated in Fig. [5] However, as soon as the dynamics is not trivial, the 
proposed algorithm uses all delayed weight parameters in order to find an optimal solution, 
without any correspondence between the master initial weight distribution and the servant 
estimated weight distribution. This is illustrated in Fig. [TT] where instead of the standard 
profiles shown in Fig. [S] a "Dirac" profile has been used in the master, while the estimated 
weights are distributed at all possible delays. In order to complete this illustration an non 
trivial dynamics is shown in Fig.fTOl 

On the complexity of the generated network 

Here we maximize © which is in relation with ©. The latter was shown to be a relevant nu- 
merical measure of the network complexity [8 , 13 1. We thus obtain, among networks which 
generate exactly the required master, the "less complex" network, in the sense of ©. A very 
simple way to figure out how complex is the servant network is to observe its generated 
raster after T, i.e., after the period of time where it matches exactly the required master's 
raster. They are indeed the same before T. 

After T, in the case of delayed weights, we still observe that if the original raster is 
periodic, the generated raster is still periodic with the same period. 

If the original raster is a-periodic, for small N and T, we have observed that the gener- 
ated master is still periodic, as illustrated in Fig.[l2j and this number roughly exponentially 
increases with N and T as predicted by the theory [8|. We however, have not observed any 
further regularity, for instance changes of regime can occur after the T delay, huge period 
can be observed for relatively small numbers of N and T, etc.. Further investigating this 
aspect is a perspective of the present work. 



3 http : //www. gnu . org/sof tware/glpk 
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5.4 Retrieving delayed weights from the observation of spikes, using hidden units 

In this last set of numerical experiments we want to verify that the proposed method in 
section [4] is "universal" and to evaluate the number of hidden neurons to be recruited in 
order to exactly simulate the required raster. If the answer is positive, this means that we 
have here available a new "reservoir computing" mechanism. 

Considering Bernoulli distribution 

We start we a completely random input, drawn from a uniform Bernoulli distribution. This 
corresponds to an input with maximal entropy. Here the master/servant trick is no more 
used. Thus, the raster to reproduce has no chance to verify the neural network dynamics 
constraints induced by {]]), unless we add hidden neurons as proposed in section[4] 

As observed in Fig. [13] we obtain as expected an exact reconstruction of the raster, 
while as reported in Fig.[l4] we need an almost maximal number of hidden neurons for this 
reconstruction, as expected since we are in a situation of maximal randomness, 

Considering correlated distributions 

We now consider a correlated random input, drawn from a Gibbs distribution |[T4l l9l. To 
make it simple, the raster input is drawn from a Gibbs distribution, i.e. a parametrized rank 
R Markov conditional probability of the form: 
P({Zi[k],l < i < N}\{Zi[k-l],l <i<N,l<l<R}) = ^exp($ x ({Zi[k-l],,l < i < N, > I > -R})) 
where $\ () is the related Gibbs potential parametrized by A and Z a normalization constant. 

This allows to test our method on highly-correlated rasters. We have chosen a potential 
of the form: 

*\{z\k=o) = r £*=i Z M + c t Eti nf= m + Ct ulo m 

thus with a term related to the firing rate r, a term related to temporal correlations Ct, and a 
term related to inter-unit correlation Q. 

We obtain a less intuitive result in this case, as illustrated in Fig. [15] event strongly 
correlated (but aperiodic) rasters are reproduced only if using as many hidden neurons as in 
the non-correlated case. In fact we have drawn the number S of hidden neurons against the 
observation period T randomly selecting 45000 inputs and have obtained the same curve as 
in Fig. [141 

This result is due to the fact that since the raster is aperiodic, thus non predictable 
changes occur in the general case, at any time. The raster must thus be generated by a 
maximal number of degrees of freedom, as discussed in the previous sections. 

In order to further illustrate this aspect, we also show in Fig.[l6]how a very sparse raster 
is simulated. We again obtain a solution with the same ratio of hidden neurons. Clearly 
the number of hidden neurons could have been less, as discussed in the previous sections. 
This shows that the algorithm is very general, but not optimal in terms of number of hidden 
neurons. 

Considering biological data 

As a final numerical experiment, we consider two examples of biological data set borrowed 
from (T) by the courtesy of the authors. Data are related to spike synchronization in a pop- 
ulation of motor cortical neurons in the monkey, during preparation for movement, in a 
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movement direction and reaction time paradigm. Raw data are presented trial by trial (with- 
out synchronization on the input), for different motion directions and the input signal is not 
represented, since meaningless for the purpose. Original data resolution was 0.1ms while 
we have have considered a 1ms scale here. 

What is interesting here is that we can apply the proposed method on non- stationary 
rasters, qualitatively very different, such as a very sparse raster, similar to the one shown in 
Fig. [16] a raster with two activity phases (presently movement preparation and execution) 
in Fig. [17] and a raster with a rich non- stationary activity in Fig. [18] In fact a dozen of such 
data sets have been tested, with the same result: exact raster reconstruction, with the same 
hidden unit ratio. 

On the computation time 

Since the computation time is exclusively the LP problem resolution computation time we 
have simply verify that we roughly obtain what is generaly observed with this algorithm, i.e. 
that the computation time order of magnitude is: 

O(ST) 

when N <CT, which is the case in our experimenation. On a standard laptop computer, this 
means a few seconds. 



6 Conclusion 

Considering a deterministic time-discretized spiking network of neurons with connection 
weights having delays, we have been able to investigate in details to which extend it is 
possible to back-engineer the networks parameters, i.e., the connection weights. Contrary to 
the known NP-hard problem which occurs when weights and delays are to be calculated, the 
present reformulation, now expressed as a Linear-Programming (LP) problem, provides an 
efficient resolution and we have discussed extensively all the potential applications of such 
a mechanism, including regarding what is known as reservoir computing mechanisms, with 
or without a full connectivity, etc.. 

At the simulation level, this is a concrete instantiation of the fact rasters produced by 
the simple model proposed here, can produce any rasters produced by more realistic models 
such as Hodgkin-Huxley, for a finite horizon. At a theoretical level, this property is reminis- 
cent of the shadowing lemma of dynamical systems theory [26], stating that chaotic orbits 
produced by a uniformly hyperbolic system can be approached arbitrary close by periodic 
orbits. 

At the computational level, we are here in front of a method which allows to "program" 
a spiking network, i.e. find a set of parameters allowing us to exactly reproduce the net- 
work output, given an input. Obviously, many computational modules where information 
is related to "times" and "events" are now easily programmable using the present method. 
A step further, if the computational requirement is to both consider "analog" and "event" 
computations, the fact that we have studied both the unit analog state and the unit event 
firing back-engineering problems (corresponding to the L and LP problems), tends to show 
that we could generalize this method to networks where both "times" and "values" have to 
be taken into account. The present equations are to be slightly adapted, yielding to a LP 
problem with both equality and inequality constraints, but the method is there. 

At the modeling level, the fact that we do not only statistically reproduce the spiking 
output, but reproduce it exactly, corresponds to the computational neuroscience paradigm 
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where "each spike matters" 1122111811 . The debate is far beyond the scope of the present work, 
but interesting enough is the fact that, when considering natural images, the primary visual 
cortex activity seems to be very sparse and deterministic, contrary to what happens with un- 
realistic stimuli [5 1. This means that it is not a nonsense to address the problem of estimating 
a raster exactly. 

As far as modeling is concerned, the most important message is in the "delayed weights 
design: the key point, in the present context is not to have one weight or one weight and 
delay but several weights at different delays". We have seen that this increase the compu- 
tational capability of the network. In other words, more than the connection's weight, the 
connection's profile matters. 

Furthermore, we point out how the related LP adjustment mechanism is distributed and 
has the same structure as an "Hebbian" rule. This means that the present learning mechanism 
corresponds to a local plasticity rule, adapting the unit weights, from only the unit and 
output spike-times. It has the same architecture as another spike-time dependent plasticity 
mechanism. However, this is supervised learning mechanisms, whereas usual STDP rules 
are unsupervised ones, while the rule implementations is entirely different. 

To which extends this LP algorithm can teach us something about how other plasticity 
mechanisms is an interesting perspective of the present work. Similarly, better understand- 
ing the dynamics of the generated networks is another issue to investigate, as pointed out 
previously. 

We consider the present approach as very preliminary and point out that it must be 
further investigated at three levels: 

i optimal number of hidden units: we have now a clear view of the role of these hidden 
units, used to span the linear space corresponding to the expected raster, as detailed in 
section |H This opens a way, not only to find a correct set of hidden units, but a minimal 
set of hidden units. This problem is in general NP-hard, but efficient heuristics may be 
found considering greedy algorithms. We have not further discussed this aspect in this 
paper, because quite different non trivial algorithms have to be considered, with the open 
question of their practical algorithmic complexity. But this is an ongoing work. 

ii approximate raster matching: we have seen that, in the deterministic case using, e.g., 
alignment metric, approximate matching is a much more challenging problem, since 
the distance to minimize are not differentiable, thus not usable without a combinatorial 
explosion of the search space. However, if we consider other metric (see B361I10I1 for a 
review), the situation may be more easy to manage, and this is to be further investigated. 

iii application to unsupervised or reinforcement learning: though we deliberately have 
considered, here, the simplest paradigm of supervised learning in order to separate the 
different issues, it is clear that the present mechanism must be studied in a more general 
setting of, e.g., reinforcement learning [39], for both computational and modeling is- 
sues. Since the specification is based on a variational formulation, such a generalization 
considering criterion related to other learning paradigms, seems possible to develop. 

Though we are still far from solving the three issues, the present study is completed in 
the sense that we not only propose theory and experimentation, but a true usable piece of 
softwar^l 
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Fig. 1 Schematic representation of a raster of iV neurons observed during a time interval T after an initial 
conditions interval D (in red). This example corresponds to a periodic raster, but non-periodic raster are also 
considered. See text for further details. 
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Fig. 2 Schematic representation of a raster of TV output neuron observed during a time interval T after an 
initial conditions interval D, with an add-on of S hidden neurons, in order increase the number of degree of 
freedom of the estimation problem. See text for further details. 
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Fig. 3 Schematic representation of a sparse trivial set of hidden neurons, allowing to generate any raster of 
length T. See text for further details. 
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Fig. 4 A "chaotic" dynamics with 20 neurons fully connected within network and simulation time T = 200, 
using both excitatory (70%) and inhibitory (30%) connections, with a = 5 (weight standard-deviation). 
After estimation, we have checked that master and servant generate exactly the same raster plot, thus only 
show the servant raster, here and in the sequel. 
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Fig. 5 A "periodic" dynamics with 20 neurons fully connected within network and simulation time T = 
200, using both excitatory (70%) and inhibitory (30%) connections, with a = 5. In this figure a periodic 
dynamics (9 periods of period 17) is observed, with 20 neurons fully connected within network and simulation 
time T = 200. Again the master and servant rasters are the same. 
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Fig. 6 Example of rather complex "chaotic" dynamics retrieved by a the LP estimation defined in (8) us- 
ing the master / servant paradigm with 50 neurons fully connected, initial conditions D = 10 and a time 
observation T = 200, used here to validate the method. 
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Fig. 7 Example of periodic dynamics retrieved by a the LP estimation defined in {8} using the master / 
servant paradigm, here a periodic raster of period 29 is observed during 4.76 periods. (N = 30, T = 150 
and D = 10) As expected from by the theory, the estimated dynamics remains periodic after the estimation 
time, thus corresponding to a parsimonious estimation. 
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Fig. 8 Weights distribution (positive and negative) used to generate delayed weights, with D = 10. 



Fig. 9 An example of trivial dynamics obtained with excitatory weights profiles shown in the top-left view 
(master weight's profile), with TV = 30, 7 = 0.98, D = 10 T = 70. The estimated weights profile (servant 
weight's profile) is shown in the top-right view. All weights have the same shape and value. To generate such 
trivial periodic raster, shown in the bottom view, only weights with a delay equal to the period have not zero 
values. This corresponds to a minimal edge of the estimation simplex, this parsimonious estimation being a 
consequence of the chosen algorithm. 
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Fig. 10 An example of non-trivial dynamics,with N = 30, 7 = 0.98, D = 10 T = 100. Profiles 
corresponding to the master's excitatory profiles are superimposed in the top-left figure, those corresponding 
to the master's inhibitory profiles are superimposed in the top-left figure. The estimated raster is shown in the 
bottom view. This clearly shows that, in the absence of additional constraint, the optimal solution corresponds 
to wide distribution of weight's profiles. 
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Fig. 11 In this figure we show that whatever be the weights and delays in the master, with N = 20, 
j = 0.98, D = 10 T = 100, the estimator use all the weights and delays for calculate the raster, in order to 
obtain an optimal solution. 
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Fig. 12 Two examples of observation of the raster period, on the slave network, observing the raster after 
the delay T where it matches the master raster (shown by an arrow in the figure), (a) With N = 10, 7 = 0.9, 
a = 3, D = 5, T = 50, a periodic regime of periode P = 4 is installed after a change in the dynamics, (b) 
With iV = 10, 7 = 0.9, a = 6, D = 5, T = 50, a periodic regime of periode P = 9 corresponds to the 
master periodic regime. 
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Fig. 13 Finding the expected dynamics from a raster with uniform distribution, (a), (c), (e) and (g) corre- 
spond to different raster with Bernoulli Distribution in addition (b), (d), (f) and (h) show the raster calculated 
by the methodology proposed. The red lines correspond to initial conditions (initial raster), the black ones 
are the spikes calculated by the method and the blues ones are the spikes in the hidden layer obtained with a 
Bernoulli Distribution. We can also observe that the number of neurons in the hidden layer increases, 1 by 1, 
between (b), (d), (f) and (h), this is because the observation time T is augmented, 5 by 5, as predicted. Here 
N = 4, 7 = 0.95, D = 5; in (a)(b) T = 25 with S = 0, in (c)(d) T = 30 with S = 1, in (e)(f) T = 35 
with S = 2, in (g)(h) T = 40 with S = 3, while S correspond to the number of neurons in the hidden layer, 
detailed in the text. 




Fig. 14 Relationship between the number of hidden neurons S and the observation time T, here N = 10, 
T = 470, D = 5,7 = 0.95 for this simulation. The right- view is a zoom of the left view. This curves shows 
the required number of hidden neurons, using the proposed algorithm, in order to obtain an exact simulation. 
We observe that S = ^ — N, thus that an almost maximal number of hidden neuron is required. This curve 
has been drawn from 45000 independent randomly selected inputs. 
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Fig. 15 Raster calculated, by the proposed method, from a highly correlated Gibbs distribution. Here r = 1, 
C t =0.5 and C % = 1. The other parameters are TV = 4, 7 = 0.95, D = 6, T = 200 with S = 29. The red 
lines correspond to initial conditions (initial raster), the black ones are the input/output spikes and the blues 
ones are the spikes in the hidden layer. 
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Fig. 16 Raster calculated, by the proposed method, from a very sparse raster, with iV = 20, 7 = 0.98, 
D = 5, T = 130 and S = 28. The hidden neurons derived by the present algorithm simply allow to 
maintain the network activity in order to fire the spare spikes at the right time. Color codes are the same as 
previously. 
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Fig. 17 Raster calculated, by the proposed method, from a biological data set, with TV = 20, 7 = 0.95, 
D = 5 T = 190 and S = 38. Color codes are the same as previously. See text for details. From (T) by the 
courtesy of the authors. 
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Fig. 18 Raster calculated, by the proposed method, from a biological data set, with TV = 20, 7 = 0.95, 
D = 5 T = 190 and S = 38. Color codes are the same as previously. See text for details. From |T| by the 
courtesy of the authors. 



